# Cohort Integration
The Cohort Integration method takes as input somatic variants annotated with Evolutionary Action (EA) scores. For each gene, the method tests whether the distribution of EA scores for the single nucleotide variants (SNVs) of that gene differs from the distribution of EA scores for all somatic SNVs of that gene. A somatic SNV distribution skewed toward large EA values may indicate loss of function (sLOF), whereas one skewed toward intermediate EA values may indicate gain of function (sGOF). The method also accounts for the frequency of other types of functionally impactful somatic variants within each gene (i.e., stop loss, in-frame indels, and frameshift indels).
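
To make the idea of "comparing two EA score distributions" concrete, here is a minimal sketch using a two-sample Kolmogorov-Smirnov statistic. This is an illustration only: this README does not state which statistical test the software applies or how the reference distribution is built, so the choice of test, the function name `ks_statistic`, and the score values below are all assumptions rather than a description of CohortInteg_SupplementalMaterial.py.

```python
from __future__ import division, print_function
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the two-sample Kolmogorov-Smirnov statistic: the largest
    absolute gap between the two empirical cumulative distributions."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The gap can only change at an observed value, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical EA scores, invented for illustration only (not real data).
gene_scores = [82.1, 90.4, 95.0, 77.3, 88.8]
reference_scores = [12.5, 35.0, 48.2, 60.1, 71.9, 83.3, 22.4, 55.6]

d = ks_statistic(gene_scores, reference_scores)
print("KS statistic: %.3f" % d)
```

A statistic near 0 means the two samples are distributed similarly; a value near 1 means they barely overlap. In this toy input the gene's scores sit almost entirely above the reference scores, which is the kind of skew toward large EA values that the description above associates with sLOF.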


# Installation
To install the Cohort Integration software:
  1. Download cohort_integration.yml, CohortInteg_SupplementalMaterial.py, and the supporting files from http://cohort.lichtargelab.org/installation
  2. Install Python 2.7 and the packages listed in cohort_integration.yml
  3. Extract the supporting files and place them in the directory that contains the CohortInteg_SupplementalMaterial.py script

# Example 
An example input file (example_input.ANNOVAR_EA) can be downloaded from http://cohort.lichtargelab.org/installation.
This file contains a subset of variants from TCGA BLCA samples, annotated with ANNOVAR to include EA scores, RefSeq NMIDs, variant classifications, and the corresponding amino acid substitutions.


# Run
  1. Create two directories:
     - input_directory: contains the input file (example_input.ANNOVAR_EA)
     - output_directory: user-defined location where output files will be saved
  2. Run the terminal command: python CohortInteg_SupplementalMaterial.py input_directory output_directory
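
The two steps above can also be scripted. The following is a minimal sketch, assuming it is run from the directory that contains CohortInteg_SupplementalMaterial.py and that `python` on your PATH is the Python 2.7 interpreter from the installation step. The helper names `build_command` and `run_cohort_integration` are hypothetical and are not part of the distributed software.

```python
import os
import subprocess

SCRIPT = "CohortInteg_SupplementalMaterial.py"


def build_command(input_directory, output_directory, python_exe="python"):
    """Assemble the documented command line as an argument list."""
    return [python_exe, SCRIPT, input_directory, output_directory]


def run_cohort_integration(input_directory, output_directory, python_exe="python"):
    """Check the input directory, create the output directory, run the script."""
    if not os.path.isdir(input_directory):
        raise IOError("input directory not found: %s" % input_directory)
    if not os.path.isdir(output_directory):
        os.makedirs(output_directory)
    # check_call raises CalledProcessError if the script exits with an error.
    command = build_command(input_directory, output_directory, python_exe)
    return subprocess.check_call(command)
```

For example, `run_cohort_integration("input_directory", "output_directory")` issues the same command as step 2 after making sure the output directory exists.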



